Predicting word accuracy for the automatic speech recognition of non-native speech
نویسندگان
چکیده
We have developed an automated method that predicts the word accuracy of a speech recognition system for non-native speech, in the context of speaking proficiency scoring. A model was trained using features based on speech recognizer scores, function word distributions, prosody, background noise, and speaking fluency. Since the method was implemented for non-native speech, fluency features, which have been used for non-native speakers’ proficiency scoring, were implemented along with several feature groups used from past research. The fluency features showed promising performance by themselves, and improved the overall performance in tandem with other more traditional features. A model using stepwise regression achieved a correlation with word accuracy rates of 0.76, compared to a baseline of 0.63 using only confidence scores. A binary classifier for placing utterances in high-or low-word accuracy bins achieved an accuracy of 84%, compared to a majority class baseline of 64%.
منابع مشابه
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
متن کاملA Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Abstract Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
متن کاملAutomatic pronunciation scoring of words and sentences independent from the non-native's first language
This paper describes an approach for automatic scoring of pronunciation quality for non-native speech. It is applicable regardless of the foreign language student’s mother tongue. Sentences and words are considered as scoring units. Additionally, mispronunciation and phoneme confusion statistics for the target language phoneme set are derived from human annotations and word level scoring result...
متن کاملOff-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
In order to facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is one of the considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
متن کاملFuzzy Clustering Approach Using Data Fusion Theory and its Application To Automatic Isolated Word Recognition
In this paper, utilization of clustering algorithms for data fusion in decision level is proposed. The results of automatic isolated word recognition, which are derived from speech spectrograph and Linear Predictive Coding (LPC) analysis, are combined with each other by using fuzzy clustering algorithms, especially fuzzy k-means and fuzzy vector quantization. Experimental results show that the...
متن کامل